ABSTRACT

Marascuilo and Levin’s (1970) notion of Type IV
errors is extended, with respect to the interpretation of
interactions in analysis of variance (ANOVA) designs. To help clarify
what an interaction is and what it is not, in terms of the ANOVA
model, the following points are made: (i) interactions should be
thought of as linear contrasts involving particular cell means; (ii)
such contrasts may be both specified and directional, in that they
may be defined to test an investigator’s a priori hypotheses; and
(iii) even the layman’s conceptualization of interactions fits nicely
into the ANOVA model. (Author)

Interactions Revisited *

The rejection or denial of a true statistical hypothesis and
the nonrejection or acceptance of a false hypothesis are abstractions
with which researchers in the behavioral sciences generally
make only passing acquaintance. While one is taught to express
concern for (and even to compute probabilities associated with)
Type I and Type II errors as a student attending a first course
in statistics, the same student in the "cruel world" of research
experiences understandable difficulty in deciding when either of
these errors has in fact occurred. As a result, one gradually
learns to live with them. Such should not be the case with Type
IV errors, as introduced by Marascuilo and Levin (1970), since
with practice this kind of error is easily recognized and
consequently avoided.

According to the definition of Marascuilo and Levin, a Type 
IV error is said to occur whenever a correct statistical test has 
been performed, but is then followed by analyses and explanations 
which are not related to the statistical test used to decide whether 
the hypothesis should or should not have been rejected. More 
succinctly, a Type IV error is made whenever a researcher offers 
an incorrect interpretation to a correctly rejected statistical 
hypothesis. Less succinctly, a Type IV error is identified as 
having been committed whenever a researcher concludes, on the basis 
of an appropriately performed statistical test, that there is a 
reliable source of variability in the data, but then proceeds to
specify the locus of the effect with an eyeball interpretation of
the data or by employing post hoc multiple comparison procedures 
which are not congruent with the hypothesis initially tested, and 
which may not even correspond to the underlying model upon which 
the statistical test was based. 

Type IV Errors in the One-Way Analysis of Variance Model

Perhaps the most commonly encountered Type IV error is the one
committed by a researcher who follows a rejected analysis of variance
(ANOVA) hypothesis with a set of overlapping multiple t-tests, each
performed at the same alpha level as chosen for the original F-test.
In this case, the statistical hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_J$
has been rejected with the probability of a Type I error set equal to
$\alpha_0$. If the researcher now examines each of the $Q = J(J-1)/2$ paired
mean comparisons at the same $\alpha_0$ level as used for the F-test, the total
probability of at least one Type I error in the set of comparisons
is inflated far above the original probability to a maximum value
of $\alpha_Q \le Q\alpha_0$. Should this procedure pronounce statistically
significant certain pairwise comparisons that would not have been
identified with the "appropriate" post hoc Scheffé (1953) method,³
then a Type IV error would have been made.
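
The inflation described above can be illustrated with a short sketch. The function name and the values J = 5 and alpha = .05 are hypothetical choices made only for illustration; the third quantity assumes the Q tests are independent, which overlapping pairwise comparisons are not, so it is shown only as a point of reference beneath the upper bound.

```python
from math import comb


def pairwise_error_bounds(num_groups: int, alpha: float):
    """Return (Q, upper bound Q*alpha, rate under assumed independence)."""
    q = comb(num_groups, 2)                      # Q = J(J-1)/2 paired comparisons
    upper_bound = min(1.0, q * alpha)            # maximum familywise Type I error rate
    independent_rate = 1.0 - (1.0 - alpha) ** q  # reference value if tests were independent
    return q, upper_bound, independent_rate


# Hypothetical example: five group means, each pair tested at alpha = .05.
q, bound, indep = pairwise_error_bounds(num_groups=5, alpha=0.05)
print(q, round(bound, 3), round(indep, 3))
```

With five groups there are ten paired comparisons, so the probability of at least one Type I error may be as large as ten times the nominal level.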

Some researchers may attempt to control the maximum probability
of at least one Type I error in the set of Q pairwise contrasts
(or in the set of any K planned comparisons, i.e., contrasts involving
linear combinations of means, as well as pairwise comparisons) by
using Bonferroni critical values as described by Miller (1966) or
Dunn (1961) to conduct their post hoc investigations. However, even
these adjustments do not eliminate Type IV errors since they are
not related to the original F-test in a one-to-one manner. Because
the Bonferroni procedures are quite powerful, especially when
the number of comparisons of interest is relatively small, they
may detect a greater number of significant differences than would
the Scheffé method, which corresponds exactly to the classical
F-test.

It is worth mentioning that Type IV errors of this kind may
be avoided simply by bypassing the F-test altogether. In point
of fact, abandoning the F-test might be the optimal strategy for a
researcher to follow if the plan is to examine only a small number
of contrasts, since the F-test could lead to a nonrejection of $H_0$
while one or more comparisons could be identified as significant
with the more powerful Bonferroni or Dunn method. In such cases,
the relative power of the Scheffé to the Dunn procedure (defined,
perhaps, in terms of the ratio of the respective critical values)
may be determined prior to data collection, in order to reach a
rational decision concerning the approach to adopt (see Davis, 1969).

It is also worth noting that in the equal sample size model,
employing the Tukey (1953) method of pairwise comparisons following
the rejection of $H_0$ based on the classical F-test may also produce
Type IV errors, since Tukey’s procedure is more likely to identify
pairwise differences as significant than is Scheffé’s. The reason
for this potential discrepancy is that the two procedures are
derived from different underlying distributions. Tukey’s more
powerful method is based on the distribution of the studentized
range for which the corresponding test statistic is the ratio of
the weighted (by the square root of the common sample size) maximum
mean difference to the square root of the mean square within.
Scheffé’s procedure, on the other hand, is based on a different
mathematical model for which the test statistic is the familiar
ratio of the mean square between to the mean square within in the
one-way ANOVA model. Thus, if a researcher is interested in
performing only pairwise contrasts, and if sample sizes are equal, then
the F-test is not the "appropriate" test to perform. In this case,
the researcher should first perform the studentized range test and
if it leads to rejection of $H_0$, then Tukey’s method of pairwise
comparisons should be employed (Dixon and Massey, 1969; Scheffé,
1959). With this strategy, the probability of making a Type IV
error is reduced to zero.

Type IV Errors in Other Models

Cautions regarding the incorrect post hoc analysis of correctly
performed omnibus tests are not confined to traditional ANOVA
designs. Marascuilo (1966) has described simultaneous inference
procedures which are "appropriate" for large-sample tests of the
differences among J independent proportions and among J independent
correlation coefficients. Steel (1961), Dunn (1964), and Marascuilo
and McSweeney (1967) have developed multiple comparison techniques
to accompany the nonparametric rank tests. The essential feature
of such post hoc methods is that they are based on the same
distribution as the test statistic, and therefore will yield
information which is congruent with the test initially performed.

As might be surmised, multivariate ANOVA hypotheses offer the
researcher a Pandora’s box of very elegant and sophisticated
Type IV errors (among others) that may result in unwarranted
explanations of significant findings. One common error following a
rejected multivariate hypothesis is to perform variable by variable
comparisons (perhaps at a reduced $\alpha$ level, or using post hoc
univariate techniques), or to interpret the significant multivariate
statistic in terms of linear combinations of dependent variables,
as might be suggested from an examination of principal components
or linear discriminant functions. While there may be some
correspondence between the decisions made under these analysis procedures
and the "appropriate" Roy-Bose multivariate post hoc method as
described by Morrison (1967), it has been shown by Hummel and Sligo
(in press) that the Type I error probabilities of the multivariate
and univariate procedures are not identical. That is, when
multivariate data are analyzed on a post hoc basis with critical
values determined from univariate procedures, one is liable to
arrive at statistical decisions which are different from those
based on multivariate post hoc techniques. As was mentioned for
the univariate case, if only a small number of comparisons
is of interest, it might be advisable to examine those comparisons
individually rather than to perform an overall multivariate F-test.

Until now, the discussion of Type IV errors has been basically
from a Type I error point of view. In other words, situations were
described in which the correspondence between the Type I error
probability of the post hoc analyses and that of the test initially
performed was less than perfect. However, another (perhaps more
serious) Type IV error occurs when a researcher defines his post
hoc contrasts in a manner which does not even investigate the
hypothesis he is presumably testing. This kind of error frequently
occurs when it comes to interpreting statistical "interactions" in
contingency tables (Goodman, 1964; Marascuilo, 1966), regression
analyses (Timm, in preparation), and factorial ANOVA designs
(Marascuilo and Levin, 1970).

Marascuilo and Levin point out that a typical strategy following
the detection of a significant interaction in a factorial ANOVA
is to make either pairwise or nested comparisons using the various
cell means of the design. They showed by examples that this
procedure is in no way related to the interaction F-test initially
performed. It was further suggested that this error arises because
many researchers do not have a clear understanding of what
constitutes an interaction as it is defined by the mathematical ANOVA
model. As a result, the post hoc procedures and/or verbal discussion
based on the identification of a significant interaction often
are inappropriate. A further clarification of the meaning of a
statistical interaction will be attempted in the sections which follow.

Mathematical Model for an I by J Factorial Design

Consider a two-way fixed effects ANOVA model with an equal
number of observations per cell. For I rows and J columns, the
model may be written as:

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk}$$

where:

$y_{ijk}$ = the value of the k-th observation in the i-th row and
j-th column

$\mu$ = a fixed constant that centers the data

$\alpha_i$ = the effect of Level i for Factor A, where $\sum_{i=1}^{I} \alpha_i = 0$

$\beta_j$ = the effect of Level j for Factor B, where $\sum_{j=1}^{J} \beta_j = 0$

$\gamma_{ij}$ = the joint effect of Level i and Level j, where
$\sum_{i=1}^{I} \gamma_{ij} = \sum_{j=1}^{J} \gamma_{ij} = 0$

$e_{ijk}$ = the error associated with the k-th observation in the
ij-th cell.

The $e_{ijk}$ are assumed to be statistically independent, normally
distributed, with a mean of zero and a variance equal to $\sigma^2$.

Under this model it is customary to test for the presence of
row effects, column effects, and interaction effects by means of three
orthogonal tests of hypothesis:

$$H_{01}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$$

$$H_{02}: \beta_1 = \beta_2 = \cdots = \beta_J = 0$$

$$H_{03}: \gamma_{11} = \gamma_{12} = \cdots = \gamma_{IJ} = 0$$

If any of the main effects is significant, one may use Scheffé’s
method to identify possible sources of variance in exactly the
same manner as that used for the one-way ANOVA. However, if the
interaction test leads to a rejection of $H_{03}$, slightly different
procedures must be used to locate the significant sources that
account for the rejection of the hypothesis.

As suggested by Marascuilo and Levin (1970), the common practice
of making simple comparisons among the cell means to locate
the sources of a significant interaction is not valid if $H_{03}$ is
rejected. The procedure is appropriate only if instead of testing
$H_{01}$, $H_{02}$, and $H_{03}$ as orthogonal hypotheses, one were to test the
composite total cell hypothesis:

$$H_{04}: \mu_{11} = \mu_{12} = \cdots = \mu_{IJ}.$$

In this case, the Scheffé coefficient would be given by:

$$S = \sqrt{(IJ-1)\,F_{(IJ-1),\,IJ(n-1)}(1-\alpha)}$$

where n = number of observations per cell and $\alpha$ = the probability
of a Type I error associated with $H_{04}$. In this case, a typical
pairwise contrast is defined by:

$$\psi = \mu_{ij} - \mu_{i'j'} = (\mu + \alpha_i + \beta_j + \gamma_{ij}) - (\mu + \alpha_{i'} + \beta_{j'} + \gamma_{i'j'})$$

$$= (\alpha_i - \alpha_{i'}) + (\beta_j - \beta_{j'}) + (\gamma_{ij} - \gamma_{i'j'})$$

Note, however, that if this contrast were found to be
significant, it would be impossible to know whether or not the
difference was due to the fact that $\alpha_i \ne \alpha_{i'}$, or that $\beta_j \ne \beta_{j'}$,
or that $\gamma_{ij} \ne \gamma_{i'j'}$, or any combination of these. In words,
contrasts of this type lead to a confounding of the model’s
parameters, and the situation is not alleviated simply by testing
$H_{04}$ instead of $H_{03}$.
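
The confounding can be made concrete with a small numerical sketch. The parameter values below are hypothetical and were chosen so that no interaction is present at all; the function name is likewise illustrative only.

```python
def pairwise_components(alpha, beta, gamma, cell, other):
    """Split mu_ij - mu_i'j' into its alpha, beta, and gamma parts."""
    (i, j), (i2, j2) = cell, other
    return (
        alpha[i] - alpha[i2],          # row (Factor A) part
        beta[j] - beta[j2],            # column (Factor B) part
        gamma[i][j] - gamma[i2][j2],   # interaction part
    )


# Hypothetical 2 by 2 parameters with every interaction effect equal to zero.
alpha = [2.0, -2.0]
beta = [1.0, -1.0]
gamma = [[0.0, 0.0], [0.0, 0.0]]

parts = pairwise_components(alpha, beta, gamma, cell=(0, 0), other=(1, 1))
difference = sum(parts)  # equals mu_11 - mu_22; the constant mu cancels
print(parts, difference)
```

Here the two cell means differ by six units even though every interaction parameter is zero, so a significant pairwise difference says nothing by itself about the interaction.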

On the basis of this discussion it should not be concluded that
there is no appropriate way to interpret the meaning of a significant
F-ratio for interaction, in terms of some linear combination of
cell means. What the previous discussion is meant to suggest is
that linear contrasts of the form $\psi = \mu_{ij} - \mu_{i'j'}$ that are typically
defined by researchers to interpret interactions are incorrect in
a Type IV error sense.

Interaction in the 2 by 2 Design 

The simplest way to consider the problem and identify valid
interaction contrasts is to reexamine a factorial design with
I = 2 and J = 2. For this design, the observed grand, row, column,
and cell means may be denoted as shown in Table 1.

Insert Table 1 about here



With four cells, there are three degrees of freedom available
for the between cell hypothesis $H_{04}$. While the test of $H_{04}$ could
be performed, the more usual approach is to take the total sum of
squares between groups and partition it into three orthogonal
components each possessing one degree of freedom and each leading to
an F-test with $\nu_1 = 1$ and $\nu_2 = 4(n-1)$. Although there is an
unlimited number of ways that could be used to partition the sum
of squares between groups, in factorial designs the partitioning
consists of three very specific orthogonal contrasts. In this case,
the two orthogonal contrasts for the A and B main effects are
respectively given by:

$$\hat{\psi}_A = \bar{y}_{1.} - \bar{y}_{2.} = (+1/2)\bar{y}_{11} + (+1/2)\bar{y}_{12} + (-1/2)\bar{y}_{21} + (-1/2)\bar{y}_{22}$$

$$\hat{\psi}_B = \bar{y}_{.1} - \bar{y}_{.2} = (+1/2)\bar{y}_{11} + (-1/2)\bar{y}_{12} + (+1/2)\bar{y}_{21} + (-1/2)\bar{y}_{22}$$

To generate the third contrast orthogonal to both $\hat{\psi}_A$ and $\hat{\psi}_B$, one
needs only multiply the coefficients pair by pair and use the resulting
products as the coefficients for the third contrast. For the 2 by 2
design, this procedure defines the contrast:

$$\hat{\psi}_{AB} = (+1/4)\bar{y}_{11} + (-1/4)\bar{y}_{12} + (-1/4)\bar{y}_{21} + (+1/4)\bar{y}_{22}.$$
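
The pair-by-pair multiplication can be sketched in a few lines. The variable names are illustrative, and the cells are assumed to be listed in the order 11, 12, 21, 22.

```python
# Coefficients of the two main effect contrasts (cell order: 11, 12, 21, 22).
psi_a = [0.5, 0.5, -0.5, -0.5]
psi_b = [0.5, -0.5, 0.5, -0.5]

# Third contrast: multiply the coefficients pair by pair.
psi_ab = [a * b for a, b in zip(psi_a, psi_b)]


def dot(u, v):
    """Sum of products of two coefficient vectors; zero means orthogonal."""
    return sum(x * y for x, y in zip(u, v))


print(psi_ab)
print(dot(psi_a, psi_b), dot(psi_a, psi_ab), dot(psi_b, psi_ab))
```

All three sums of cross-products are zero, which is what makes the three contrasts mutually orthogonal when cell sizes are equal.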







It should be noted that the fractional coefficients (within
contrasts) are not essential to the valid use of orthogonal
comparisons. Rather, it is the ratio of the coefficients to one
another (between contrasts) that must be maintained. If the fractions
are converted to integers, then the complete collection of linear
contrasts may be represented by a contrast matrix as shown in
Table 2, where the rows represent the individual cell means while
the columns represent the orthogonal contrasts which constitute
the elements of the factorial ANOVA design.

Insert Table 2 about here

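Since in the 2 by 2 model $\alpha_1 = -\alpha_2$, $\beta_1 = -\beta_2$, and
$\gamma_{11} = -\gamma_{12} = -\gamma_{21} = \gamma_{22}$, it can be shown that the expected values
of each of the Table 2 contrasts contain only the single parameter
of interest; that is:

$$E(\hat{\psi}_A) = E(\bar{y}_{1.} - \bar{y}_{2.}) = E[(\bar{y}_{1.} - \bar{y}_{..}) - (\bar{y}_{2.} - \bar{y}_{..})]$$

$$= E(\hat{\alpha}_1 - \hat{\alpha}_2) = \alpha_1 - \alpha_2 = 2\alpha_1$$

$$E(\hat{\psi}_B) = E(\bar{y}_{.1} - \bar{y}_{.2}) = E[(\bar{y}_{.1} - \bar{y}_{..}) - (\bar{y}_{.2} - \bar{y}_{..})]$$

$$= E(\hat{\beta}_1 - \hat{\beta}_2) = \beta_1 - \beta_2 = 2\beta_1$$

$$E(\hat{\psi}_{AB}) = E(\bar{y}_{11} - \bar{y}_{12} - \bar{y}_{21} + \bar{y}_{22})$$

$$= E[(\bar{y}_{11} - \bar{y}_{1.} - \bar{y}_{.1} + \bar{y}_{..}) - (\bar{y}_{12} - \bar{y}_{1.} - \bar{y}_{.2} + \bar{y}_{..})$$

$$- (\bar{y}_{21} - \bar{y}_{2.} - \bar{y}_{.1} + \bar{y}_{..}) + (\bar{y}_{22} - \bar{y}_{2.} - \bar{y}_{.2} + \bar{y}_{..})]$$

$$= E(\hat{\gamma}_{11} - \hat{\gamma}_{12} - \hat{\gamma}_{21} + \hat{\gamma}_{22}) = \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22} = 4\gamma_{11}$$

so that the contrasts as defined will produce unconfounded estimates
of their respective effects.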
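
As a numerical check on these expected values, the following sketch builds the four population cell means of a 2 by 2 design from hypothetical parameter values that satisfy the sum-to-zero constraints and then applies the three contrasts to them. All names and numbers are illustrative assumptions.

```python
# Hypothetical parameters for a 2 by 2 design; the remaining effects follow
# from the sum-to-zero constraints (alpha_2 = -alpha_1, and so on).
mu, alpha_1, beta_1, gamma_11 = 10.0, 2.0, 1.0, 0.5

alpha = [alpha_1, -alpha_1]
beta = [beta_1, -beta_1]
gamma = [[gamma_11, -gamma_11], [-gamma_11, gamma_11]]

# Population cell means mu_ij = mu + alpha_i + beta_j + gamma_ij.
m = [[mu + alpha[i] + beta[j] + gamma[i][j] for j in range(2)] for i in range(2)]

psi_a = (m[0][0] + m[0][1]) / 2 - (m[1][0] + m[1][1]) / 2   # row mean difference
psi_b = (m[0][0] + m[1][0]) / 2 - (m[0][1] + m[1][1]) / 2   # column mean difference
psi_ab = m[0][0] - m[0][1] - m[1][0] + m[1][1]              # interaction contrast

print(psi_a, psi_b, psi_ab)
```

Each contrast recovers a multiple of exactly one parameter, regardless of the values assigned to the other two.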

Thus, if $H_{01}$ is true, then $\psi_A = 0$. In like manner, $\psi_B = 0$
and $\psi_{AB} = 0$ if $H_{02}$ and $H_{03}$ are true. This means that in the 2 by 2
factorial design the hypotheses $H_{01}$, $H_{02}$, and $H_{03}$ are equivalent
to the hypotheses:

$$H_{01}: \psi_A = 0, \quad H_{02}: \psi_B = 0, \quad \text{and} \quad H_{03}: \psi_{AB} = 0$$

so that hypotheses about equal parameter values are identical to
hypotheses about contrasts being equal to zero.

It should be noted that the ANOVA hypotheses $H_{01}$, $H_{02}$, and $H_{03}$
written as hypotheses about $\psi_A$, $\psi_B$, and $\psi_{AB}$ involve every cell of
the 2 by 2 design. As will be seen shortly, the inclusion of every
cell (weighted equally) in each contrast is true of all $2^K$ designs.
This is important for a researcher to keep in mind when interpreting
significant effects. In such cases, to account for an effect on
the basis of anything less than the equal contribution of every
cell is to commit a Type IV error.

Interaction in the I by J Design

It should be noted that in the 2 by 2 factorial design there
is only one way to define three orthogonal contrasts that relate
to the main effect for A, the main effect for B, and their
interaction. Thus, the test that the parameter values are equal is
identical to the test that the corresponding contrast is equal to
zero. As soon as I or J exceeds 2, this last statement is no
longer true since there is an infinite number of ways to partition
the sum of squares associated with an effect that has three or more
levels. This means that the test of equal parameter values does
not have a simple counterpart in a test stating that a specified
contrast is equal to zero. Instead, the correspondence must be made
by a statement relating all possible contrasts as being equal to
zero. To illustrate this point, consider a 2 by 3 design, as
displayed in Table 3.

Insert Table 3 about here 

Since this design consists of six cells, five degrees of freedom
are available for partitioning the between groups sum of squares.
One possible set of five orthogonal contrasts which may be tested
by this factorial design is presented in Table 4.⁵ Since I = 2, the
first column of Table 4 is similar to the first column of Table 2.
Differences among the three levels of Factor B are tested by means
of two contrasts, each representing one degree of freedom. The
contrast in the second column compares Level 1 and Level 2 of Factor
B, while the contrast in the third column compares Level 3 of Factor
B with Levels 1 and 2 combined. The interaction effects are
measured by means of the contrasts defined in the fourth and fifth columns.
The contrast in the fourth column is found by multiplying the
coefficients of the first column by those of the second, producing
a contrast that measures the differential effect of Levels 1 and 2
of Factor B at the two levels of Factor A. Finally, the fifth
column is found by multiplying the coefficients of the first and
third columns. This contrast measures the differential effect of
the combined first two levels and Level 3 of Factor B at the two
levels of Factor A.

Insert Table 4 about here
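
The construction of the five columns can be sketched as follows. Table 4 itself is not reproduced in this section, so the integer coefficients and their signs below are assumptions inferred from the verbal description (cells ordered 11, 12, 13, 21, 22, 23); the names are illustrative.

```python
from itertools import combinations

# Assumed main effect columns for the 2 by 3 design.
col_a = [1, 1, 1, -1, -1, -1]     # Level 1 vs. Level 2 of Factor A
col_b1 = [1, -1, 0, 1, -1, 0]     # Level 1 vs. Level 2 of Factor B
col_b2 = [1, 1, -2, 1, 1, -2]     # Levels 1 and 2 combined vs. Level 3 of Factor B


def product(u, v):
    """Multiply two coefficient columns element by element."""
    return [x * y for x, y in zip(u, v)]


col_ab1 = product(col_a, col_b1)  # fourth column: first times second
col_ab2 = product(col_a, col_b2)  # fifth column: first times third

columns = [col_a, col_b1, col_b2, col_ab1, col_ab2]

# Every pair of columns should have a zero sum of cross-products.
all_orthogonal = all(
    sum(x * y for x, y in zip(u, v)) == 0 for u, v in combinations(columns, 2)
)
print(col_ab1, col_ab2, all_orthogonal)
```

Note that each interaction column involves at least four cells, never just two.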

Any of the five contrasts which were of interest to the 
researcher could be evaluated as planned comparisons in the manner 
described in a following section. The important point to note in 
this discussion is that the interaction contrasts are defined 
by more than two cells of the design (which will be true for all 
interaction contrasts), thereby indicating the inappropriateness 
of attempts to interpret significant interactions strictly on the 
basis of pairwise or nested statistical comparisons of cell means. 

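Moreover, the expected values of both $\hat{\psi}_{AB_1}$ and $\hat{\psi}_{AB_2}$ indicate that
they are indeed true interaction contrasts, since:

$$E(\hat{\psi}_{AB_1}) = E(\bar{y}_{11}) - E(\bar{y}_{12}) - E(\bar{y}_{21}) + E(\bar{y}_{22})$$

$$= (\mu + \alpha_1 + \beta_1 + \gamma_{11}) - (\mu + \alpha_1 + \beta_2 + \gamma_{12})$$

$$- (\mu + \alpha_2 + \beta_1 + \gamma_{21}) + (\mu + \alpha_2 + \beta_2 + \gamma_{22})$$

$$= \gamma_{11} - \gamma_{12} - \gamma_{21} + \gamma_{22}$$

and

$$E(\hat{\psi}_{AB_2}) = E(\bar{y}_{11}) + E(\bar{y}_{12}) - 2E(\bar{y}_{13}) - E(\bar{y}_{21}) - E(\bar{y}_{22}) + 2E(\bar{y}_{23})$$

$$= (\mu + \alpha_1 + \beta_1 + \gamma_{11}) + (\mu + \alpha_1 + \beta_2 + \gamma_{12}) - 2(\mu + \alpha_1 + \beta_3 + \gamma_{13})$$

$$- (\mu + \alpha_2 + \beta_1 + \gamma_{21}) - (\mu + \alpha_2 + \beta_2 + \gamma_{22}) + 2(\mu + \alpha_2 + \beta_3 + \gamma_{23})$$

$$= \gamma_{11} + \gamma_{12} - 2\gamma_{13} - \gamma_{21} - \gamma_{22} + 2\gamma_{23}$$

which contain only interaction ($\gamma$) parameters.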

Since sets of contrasts different from those of Table 4 may
be used to partition the individual sums of squares for the B
factor and the interaction, it follows that the tests of
$H_{02}: \beta_1 = \beta_2 = \beta_3 = 0$ and $H_{03}: \gamma_{11} = \gamma_{12} = \cdots = \gamma_{23} = 0$ are not
equivalent to the tests of $H_{02}: \psi_{B_1} = 0, \psi_{B_2} = 0$ and
$H_{03}: \psi_{AB_1} = 0, \psi_{AB_2} = 0$, as was the case in the 2 by 2 design. As soon as I or
J exceeds two, the statistical hypothesis of equal parameter values
is equivalent to the hypothesis that states that all contrasts
are equal to zero. In this sense, $H_{02}$ and $H_{03}$ are equivalent to
the hypotheses $H_{02}$: All $\psi_B = 0$ and $H_{03}$: All $\psi_{AB} = 0$. Because
these are the hypotheses that are actually tested in the analysis
of variance, one may always use Scheffé’s method following a
rejected overall hypothesis provided that the appropriate type
of contrast has been defined, and that the appropriate Scheffé
coefficient is selected. Any number of contrasts may be examined
as long as the expected value of the contrast reduces to a contrast
in the $\gamma_{ij}$ only.
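
One way to check whether a proposed contrast in the cell means reduces to the interaction parameters alone is to arrange its coefficients in the I by J layout of the design and verify that they sum to zero within every row and within every column; the row sums are the weights placed on the row effects and the column sums are the weights placed on the column effects. The sketch below assumes this layout, and the function name and example contrasts are hypothetical.

```python
def is_interaction_contrast(coefs, tol=1e-12):
    """True if coefficients sum to zero in every row and every column."""
    rows_ok = all(abs(sum(row)) < tol for row in coefs)
    cols_ok = all(abs(sum(col)) < tol for col in zip(*coefs))
    return rows_ok and cols_ok


# Example contrasts for a 2 by 3 design (rows: Factor A, columns: Factor B).
pairwise = [[1, 0, 0],
            [0, 0, -1]]          # mu_11 - mu_23: a simple cell comparison
nested = [[1, -1, 0],
          [0, 0, 0]]             # mu_11 - mu_12: a comparison within one row
tetrad = [[1, -1, 0],
          [-1, 1, 0]]            # difference between two row differences

for name, c in [("pairwise", pairwise), ("nested", nested), ("tetrad", tetrad)]:
    print(name, is_interaction_contrast(c))
```

Under this check, only the difference between differences qualifies; the pairwise and nested comparisons carry main effect parameters along with them.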

Thus, if $H_{03}$, the overall test of interaction, is performed and
if the hypothesis is rejected, then Scheffé’s method will guarantee
Type I error protection for all contrasts investigated, but only
(a) if they are valid interaction contrasts, and (b) if the Scheffé
coefficient is based on the degrees of freedom associated with
the test of $H_{03}$. Valid interaction contrasts would include tests
of interaction effects being different from zero ($\gamma_{ij} = 0$), or from
one another ($\gamma_{ij} = \gamma_{i'j'}$), as well as differences between row or
column differences ($\Delta_i = \Delta_{i'}$) as described by Marascuilo and Levin
(1970). In addition, any contrast in the cell means may be studied,
provided that it is shown to be a contrast involving the $\gamma_{ij}$ only.

This discussion suggests that future textbook writers might
well denote analysis of variance F-test hypotheses as $H_0$: All $\psi = 0$.

Interactions in $2^K$ Factorial Designs

Once the general model of a two-factor design is understood,
it becomes quite easy to extend the discussion of interactions to
more complex designs.

The extension simply involves the addition of more rows and
columns to the contrast matrix. For this extension, consider a
2 by 2 by 2 design, as represented in Table 5. In this case, the
first, second, and third subscripts refer to the levels of Factors
A, B, and C respectively. The seven between-group contrast
coefficients corresponding to this design are shown in Table 6. Note
that, as before, each of the first-order (two-factor) interaction
contrast vectors may be generated by obtaining the products of
the constituent factors. Similarly, the A x B x C second-order
interaction is the product of the A, B, and C main effect contrast
vectors.

Insert Table 5 about here

Insert Table 6 about here

In examining the coefficients of the A x B x C interaction
more closely, it will be noticed that the first four coefficients
represent the B x C interaction for $A_1$, while the last four
coefficients represent the B x C interaction for $A_2$. The difference
between the B x C interaction at Levels $A_1$ and $A_2$ may be written
as follows:

$$\hat{\psi}_{ABC} = [(+1)\bar{y}_{111} + (-1)\bar{y}_{112} + (-1)\bar{y}_{121} + (+1)\bar{y}_{122}]$$

$$- [(+1)\bar{y}_{211} + (-1)\bar{y}_{212} + (-1)\bar{y}_{221} + (+1)\bar{y}_{222}]$$

$$= (+1)\bar{y}_{111} + (-1)\bar{y}_{112} + (-1)\bar{y}_{121} + (+1)\bar{y}_{122} + (-1)\bar{y}_{211}$$

$$+ (+1)\bar{y}_{212} + (+1)\bar{y}_{221} + (-1)\bar{y}_{222}$$

which is recognized as the contrast defined in the last column
of Table 6. It is easy to show that $\hat{\psi}_{ABC}$ is an unbiased estimate of

$$\psi_{ABC} = \gamma_{111} - \gamma_{112} - \gamma_{121} + \gamma_{122} - \gamma_{211} + \gamma_{212} + \gamma_{221} - \gamma_{222},$$

a contrast involving the interaction parameters only.
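
The product rule for the 2 by 2 by 2 design can be sketched as below. Table 6 is not reproduced in this section, so the sign convention (+1 for Level 1, -1 for Level 2) and the cell order 111, 112, 121, 122, 211, 212, 221, 222 are assumptions; the dictionary keys are illustrative labels.

```python
from itertools import combinations, product

# Each cell is a triple of signs: +1 for Level 1, -1 for Level 2 of A, B, C.
cells = list(product([+1, -1], repeat=3))   # order 111, 112, 121, ..., 222

# Main effect contrast vectors.
main = {name: [cell[k] for cell in cells] for k, name in enumerate("ABC")}

# Interaction vectors are products of the constituent main effect vectors.
contrasts = dict(main)
for size in (2, 3):
    for names in combinations("ABC", size):
        coefs = [1] * len(cells)
        for name in names:
            coefs = [c * m for c, m in zip(coefs, main[name])]
        contrasts["".join(names)] = coefs

print(contrasts["ABC"])
```

The resulting A x B x C vector matches the coefficients of the expanded contrast written out above, and all seven vectors are mutually orthogonal.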

With one-degree-of-freedom tests, the F-test always corresponds
to the contrast examined and to its post hoc discussion. Should
the three-factor interaction of Table 6 prove to be statistically
significant, then the effect is immediately traceable to the equally
weighted linear contrast defined in the last column, and not to
any isolated cell or cells of the total design. Instead, the
interpretation must include every cell of the design because
the test of interaction is identical to the test of $H_0: \psi_{ABC} = 0$
versus $H_1: \psi_{ABC} \ne 0$.






Interactions in the I by J by K or Higher-Order Designs

When I, J, or K exceeds 2, then the corresponding analysis
of variance hypotheses cannot be stated in terms of a single
contrast. In this case, the hypotheses again relate to all possible
contrasts that could be generated under the model. Thus, for the
three-factor interaction, the classical F-test actually tests the
hypothesis $H_0$: All interaction contrasts are identically equal
to zero. The alternative hypothesis is given by $H_1$: At least one
interaction contrast is different from zero.

If $H_0$ is rejected, confidence intervals may be built around
the individual $\gamma_{ijk}$ or around linear combinations of the various
cells that are indeed true interaction contrasts (not simple
comparisons among cell means). Type IV errors are readily avoided
by limiting one’s verbal interpretation of the interaction to those
true interaction comparisons for which the Scheffé post hoc
confidence interval does not include zero.

Interactions as Planned Comparisons 

Some researchers have a number of misconceptions concerning 
the partitioning of the sum of squares in complex designs. Whereas 
most know that they may generate one-degree-of-freedom tests
"within" a main effect that may be tested as planned orthogonal
comparisons, they fail to realize that a partitioning of the
interaction sum of squares is also possible when planned contrasts 
(in the form of those in Table 4) are specified. The prevailing 
belief is that while main effects may be decomposed into one- 
degree-of-freedom contrasts, interactions must be assessed in the 
context of multi-degree-of-freedom omnibus F-tests. That this is 
not true is illustrated in the following example. 

Consider a cross-sectional study consisting of two factors,
Sex (male and female) and Age (6, 8, 10, and 12 years of age), as
portrayed in Table 7. Further suppose that an investigator has
reason to believe that a certain cognitive ability is of such a
nature that in the primary grades there is a large sex difference
in favor of girls, but that the difference diminishes over the
elementary school years. This statement has the flavor of an
interaction hypothesis which could be evaluated on a post hoc
basis (as outlined in the preceding sections) following the
rejection of the hypothesis of no interaction with a statistical
test based on three degrees of freedom for the numerator.

Insert Table 7 about here

But reconsider the investigator’s hypothesis. On the surface,
it appears that the hypothesis states that the mean profile for
the boys over the four age levels is not parallel to the
corresponding profile for the girls. Actually, it is more explicit than
that. It states that a relatively large initial girl-boy difference
will be observed that will decrease as age increases. If the
investigator’s hypothesis is correct, then symbolically:

$$(\mu_{11} - \mu_{21}) > (\mu_{12} - \mu_{22}) > (\mu_{13} - \mu_{23}) > (\mu_{14} - \mu_{24})$$

In this case, it is easy to relate the interaction hypothesis
to an interaction test for trend using the coefficients for linear,
quadratic and cubic components. By referring to a standard table
of orthogonal polynomials, as in Hays (1963) or in Kirk (1968),
one may use the same coefficients that test for trend within the
main effects sum of squares to test the trend interaction hypothesis
within the interaction sum of squares. The contrast matrix appropriate
for testing this is presented in Table 8.



Insert Table 8 about here 



The first column defines the contrast for comparing the
girls’ and boys’ overall (across age) performance. The next
three columns constitute three orthogonal contrasts that test
for the main effect of age by means of a trend analysis for linear,
quadratic, and cubic components. (Depending on the researcher’s
hypothesis regarding the age main effect, the three trend contrasts
would be tested either individually or collectively.) These
coefficients are read directly from Table VI of Hays (1963). The
last three columns are generated from the first four: Column 5
is found by multiplying the coefficients of Columns 1 and 2 to produce
the linear Sex by Age interaction contrast, Column 6 is the product
of Columns 1 and 3, while Column 7 is the product of Columns 1 and 4.
These latter two sets of coefficients define the quadratic and the cubic
Sex by Age interaction contrasts respectively.

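In this example, the contrast of primary interest is defined
by the coefficients of Column 5. This contrast is given by:

$$\hat{\psi}_{S \times A(\text{linear})} = (-3)\bar{y}_{11} + (-1)\bar{y}_{12} + (+1)\bar{y}_{13} + (+3)\bar{y}_{14} + (+3)\bar{y}_{21}$$

$$+ (+1)\bar{y}_{22} + (-1)\bar{y}_{23} + (-3)\bar{y}_{24}$$

which may be written as:

$$\hat{\psi}_{S \times A(\text{linear})} = -3(\bar{y}_{11} - \bar{y}_{21}) - 1(\bar{y}_{12} - \bar{y}_{22}) + 1(\bar{y}_{13} - \bar{y}_{23}) + 3(\bar{y}_{14} - \bar{y}_{24})$$

which is seen to have the same basic form as that used for the
linear trend for main effects except that the coefficients in this
case are applied to the mean sex difference at each of the four
age levels. In addition, it should be noted that $\hat{\psi}_{S \times A(\text{linear})}$ is
a valid interaction contrast. In the mathematical model, $\hat{\psi}_{S \times A(\text{linear})}$
is an estimate of:

$$\psi_{S \times A(\text{linear})} = -3(\mu + \alpha_1 + \beta_1 + \gamma_{11} - \mu - \alpha_2 - \beta_1 - \gamma_{21})$$

$$- 1(\mu + \alpha_1 + \beta_2 + \gamma_{12} - \mu - \alpha_2 - \beta_2 - \gamma_{22})$$

$$+ 1(\mu + \alpha_1 + \beta_3 + \gamma_{13} - \mu - \alpha_2 - \beta_3 - \gamma_{23})$$

$$+ 3(\mu + \alpha_1 + \beta_4 + \gamma_{14} - \mu - \alpha_2 - \beta_4 - \gamma_{24})$$

$$= -3(\gamma_{11} - \gamma_{21}) - 1(\gamma_{12} - \gamma_{22}) + 1(\gamma_{13} - \gamma_{23}) + 3(\gamma_{14} - \gamma_{24})$$

(the $\alpha_1 - \alpha_2$ terms cancel because the trend coefficients sum to zero),
which contains only interaction parameters and, thus, is not
confounded with other effects.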

Some hypothetical data and an ANOVA based on them may be found 
in Tables 9 and 10. Calculations of the sums of squares for each source 



Insert Table 9 about here 



Insert Table 10 about here 



of variance and for the planned interaction comparison are as follows: 



_ — 9 

SS - Jn T (y- - y V 

s 1b1 1. 

« 4(6) [(18.0 - 15.5) 2 + (13.0 - 15. 5) 2 ] « 24[(-2.5) 2 + (2.5) 2 ] 

- 24(12.5) - 300 

ss a " I n I ry > 2 

* J * * 

■* 2(6) [10.0 - 15.5) 2 + (15.0 - 15. 5) 2 + (17.5 - 15. 5) 2 
+ (19.5 - 15.5) 2 ] 

- 12[(-5.5) 2 + (-.5) 2 + (2.0) 2 + (4.0) 2 ] 

« 12(50.5) « 606 



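\[
\begin{aligned}
SS_{S \times A} &= n \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar{y}_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2 \\
&= 6[(14.0 - 18.0 - 10.0 + 15.5)^2 + (6.0 - 13.0 - 10.0 + 15.5)^2 \\
&\quad + \ldots + (19.0 - 13.0 - 19.5 + 15.5)^2] \\
&= 6(13) = 78
\end{aligned}
\]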



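\[
\begin{aligned}
\hat{\psi}_{S \times A(\text{linear})} &= -3(\bar{y}_{11} - \bar{y}_{21}) - 1(\bar{y}_{12} - \bar{y}_{22}) + 1(\bar{y}_{13} - \bar{y}_{23}) + 3(\bar{y}_{14} - \bar{y}_{24}) \\
&= -3(14 - 6) - 1(18 - 12) + 1(20 - 15) + 3(20 - 19) \\
&= -3(8) - 1(6) + 1(5) + 3(1) \\
&= -22
\end{aligned}
\]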



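\[
SS_{\hat{\psi}_{S \times A(\text{linear})}} = \frac{n\,\hat{\psi}^2_{S \times A(\text{linear})}}{\sum_{i=1}^{I} \sum_{j=1}^{J} c_{ij}^2} = \frac{6(-22)^2}{(-3)^2 + (-3)^2 + (-1)^2 + \ldots + (+3)^2} = \frac{6(484)}{40} = 72.6
\]

\[
\begin{aligned}
SS_{(\text{remainder})} &= SS_{\hat{\psi}_{S \times A(\text{quadratic})}} + SS_{\hat{\psi}_{S \times A(\text{cubic})}} \\
&= SS_{S \times A} - SS_{\hat{\psi}_{S \times A(\text{linear})}} \\
&= 78.0 - 72.6 \\
&= 5.4
\end{aligned}
\]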
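
The hand calculations above can be reproduced directly from the cell means. The following Python sketch is an illustration, not part of the original analysis; the variable names are hypothetical, while the means, the cell size n = 6, and the coefficients are taken from Tables 8 and 9.

```python
# Cell means from Table 9: rows are sex (girls, boys), columns are ages 6, 8, 10, 12.
means = [[14.0, 18.0, 20.0, 20.0],
         [6.0, 12.0, 15.0, 19.0]]
n = 6  # subjects per cell

I, J = len(means), len(means[0])
sex_means = [sum(row) / J for row in means]
age_means = [sum(means[i][j] for i in range(I)) / I for j in range(J)]
grand_mean = sum(sex_means) / I

# Main effect and interaction sums of squares.
ss_sex = J * n * sum((m - grand_mean) ** 2 for m in sex_means)
ss_age = I * n * sum((m - grand_mean) ** 2 for m in age_means)
ss_sxa = n * sum(
    (means[i][j] - sex_means[i] - age_means[j] + grand_mean) ** 2
    for i in range(I) for j in range(J)
)

# Planned linear Sex by Age interaction contrast (Column 5 of Table 8).
coefs = [[-3, -1, 1, 3],
         [3, 1, -1, -3]]
psi_hat = sum(coefs[i][j] * means[i][j] for i in range(I) for j in range(J))
sum_c2 = sum(c ** 2 for row in coefs for c in row)
ss_psi = n * psi_hat ** 2 / sum_c2
ss_remainder = ss_sxa - ss_psi

print(ss_sex, ss_age, ss_sxa)
print(psi_hat, round(ss_psi, 1), round(ss_remainder, 1))
```

Run as written, the sketch should agree with the entries of Table 10: the linear contrast accounts for 72.6 of the 78 units of interaction sum of squares.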

Clearly, $\hat{\psi}_{S \times A(\text{linear})}$ gets to the heart of the investigator's
query, i.e., whether there exists a decreasing sex difference in
the cognitive ability as a function of increasing age. His question
is evaluated statistically by weighting the four girl-boy differences
by the appropriate coefficients that are related to linear trend.
If the investigator were interested in other characteristics of the
girl-boy differences, the higher order trend components could be
examined individually.

If the factor, Sex: {males, females} were replaced by a
factor with more than two levels, e.g., {Social Class: high, middle,
low}, then defining interesting planned orthogonal contrasts might
become a little more difficult. However, if the interaction were
assessed via the omnibus (with 6 df) test that: All $\psi_{A \times SC} = 0$,
and if such a test were statistically significant, then contrasts
could be defined to compare the mean linear high-low, high-middle,
and middle-low differences, using Scheffé's procedure. Note that
the orthogonality restriction on contrasts is relevant only insofar
as partitioning sums of squares into nonoverlapping pieces for
hypothesis testing on an a priori basis is concerned. However, if
the entire set of contrasts is tested collectively and if such a
test produces a significant F, then Scheffé's method may be applied
to all comparisons (orthogonal and non-orthogonal alike) that
strike the investigator's fancy, as long as they represent true
contrasts among the parameters indicated in the initial test.

It is worth mentioning that the hypothesis of mean girl-boy 
differences predicts that the differences will decrease as age 
increases. This is certainly in the mode of a directional hypothesis 
and therefore to achieve maximum statistical power, it should be 
analyzed as a directional (one-tailed) alternative. Since the 
hypothesis is related to a linear contrast, one may perform the test 
by use of the Student t-distribution, by means of the test statistic: 



\[
t = \frac{\hat{\psi}_{S \times A(\text{linear})}}{SE_{\hat{\psi}_{S \times A(\text{linear})}}}
= \frac{-3(\bar{y}_{11} - \bar{y}_{21}) - 1(\bar{y}_{12} - \bar{y}_{22}) + 1(\bar{y}_{13} - \bar{y}_{23}) + 3(\bar{y}_{14} - \bar{y}_{24})}
{\sqrt{\dfrac{MS_E}{n}\left[(-3)^2 + (-3)^2 + (-1)^2 + (-1)^2 + (+1)^2 + (+1)^2 + (+3)^2 + (+3)^2\right]}}
\]



which is simply the square root of the F-ratio based on the same
contrast. If the investigator's claim is true, then one would expect
that:

\[
(\bar{y}_{11} - \bar{y}_{21}) > (\bar{y}_{12} - \bar{y}_{22}) > (\bar{y}_{13} - \bar{y}_{23}) > (\bar{y}_{14} - \bar{y}_{24})
\]

which when weighted by the above coefficients would produce a negative
value of t. Thus, as a one-tailed test, the hypothesis $H_0$: $\psi_{S \times A(\text{linear})} = 0$
should be rejected if the observed $t < t_{\nu_2}(\alpha)$, where $t_{\nu_2}(\alpha)$ is the
critical value of t, based on the degrees of freedom associated with
$MS_E$, at the $\alpha(100)$ percentile.
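
For the hypothetical data of Table 9, the directional test can be sketched as follows. This is an illustration under stated assumptions: the variable names are hypothetical, the significance level of .05 is an assumption not specified in the text, and the critical value is the standard tabled one-tailed value of Student's t with 40 degrees of freedom.

```python
import math

psi_hat = -22.0                      # contrast estimate from the Table 9 means
ms_error = 16.0                      # MS_E from Table 10
n = 6                                # subjects per cell
coefs = [-3, -1, 1, 3, 3, 1, -1, -3]  # Column 5 of Table 8

sum_c2 = sum(c ** 2 for c in coefs)
standard_error = math.sqrt(ms_error * sum_c2 / n)
t_stat = psi_hat / standard_error

# F-ratio for the same contrast: SS_psi / MS_E with 1 and 40 df.
f_ratio = (n * psi_hat ** 2 / sum_c2) / ms_error

# Assumed alpha = .05, one-tailed, 40 df (tabled value 1.684, lower tail).
t_critical = -1.684
reject_null = t_stat < t_critical

print(round(t_stat, 3), round(f_ratio, 4), reject_null)
```

The squared t statistic equals the F-ratio for the contrast, as stated above, and under the assumed .05 level the observed t falls in the lower-tail rejection region.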

Interaction in the 2 by 2 Intuitive ANOVA Design

The basic argument presented in this paper is that interaction
contrasts defined following a significant F-ratio must include more
than two cells of the design and, further, must reduce to a contrast
involving the interaction parameters only. Thus, in all of the
examples presented, the contrasts examined have been defined in
such a way that a linear combination of the cell means was really
estimating some linear combination of the $\gamma_{ij}$. This was also true
for the 2 by 2 design in which it was seen that the only contrast
associated with a significant interaction was given by:

\[
\hat{\psi}_{AB} = (+1)\bar{y}_{11} + (-1)\bar{y}_{12} + (-1)\bar{y}_{21} + (+1)\bar{y}_{22}
= (\bar{y}_{11} + \bar{y}_{22}) - (\bar{y}_{12} + \bar{y}_{21})
\]

It is worth noting that the form of this contrast is independent 
of the mathematical ANOVA model and would thereby be encountered under 
the intuitive ANOVA model discussed by Marascuilo and Levin (1970).
Marascuilo and Levin suggested that many behavioral scientists
"intuit" a statistical interaction in much the same way that
pharmacists view the joint cumulative effects of two drugs when taken
together. For example, neither, either, or both of two drugs (A and B)
might be administered to four independent groups of Ss as follows:

Group     Drug Treatment

I         Placebo
II        Drug A
III       Drug B
IV        Drugs A and B



One may represent the intuitive model by means of the factorial design
in Table 11.

Insert Table 11 about here

A brief inspection of the four drug treatment combinations
indicates why interactions are intuitively traced to a single
treatment or cell. In this model, a statistically significant
interaction component is immediately attributed to the responses of
the subjects in Group IV. However, in order to estimate the magnitude
of the interaction, or $\gamma$, component at least one comparison would
have to be made within the design.

At first glance, it might seem that an estimate of $\gamma$ could be
obtained by comparing the average response of Group IV with the
pooled average of Groups II and III by means of the contrast:

\[
\hat{\psi} = \bar{y}_{22} - \tfrac{1}{2}(\bar{y}_{12} + \bar{y}_{21})
\]

According to the algebra of expected values:

\[
\begin{aligned}
E(\hat{\psi}) &= E(\bar{y}_{22}) - \tfrac{1}{2}E(\bar{y}_{12}) - \tfrac{1}{2}E(\bar{y}_{21}) \\
&= \mu + \alpha + \beta + \gamma - \tfrac{1}{2}(\mu + \alpha) - \tfrac{1}{2}(\mu + \beta) = \gamma + \tfrac{1}{2}(\alpha + \beta)
\end{aligned}
\]

Hence, it is clear that this contrast does not do the job, since
the interaction effect is partially confounded with the $\alpha$ and $\beta$ effects.

Alternatively, it might be decided not to average the response
of Groups II and III, but to evaluate the interaction by means of:

\[
\hat{\psi} = \bar{y}_{22} - (\bar{y}_{12} + \bar{y}_{21})
\]

which in this case provides an unbiased estimate of:

\[
E(\hat{\psi}) = \gamma - \mu
\]

Unfortunately, this estimate of $\gamma$ is biased, in that it tends
to underestimate the effect of $\gamma$ by the amount $\mu$. Moreover,
the linear combination considered is not a legitimate contrast
since the sum of the coefficients does not add to zero. However,
it can be modified to form a contrast by considering the sum of
the averages in Groups I and IV as contrasted with the sum of the
averages in Groups II and III. With the contrast:

\[
\hat{\psi} = (\bar{y}_{11} + \bar{y}_{22}) - (\bar{y}_{12} + \bar{y}_{21})
\]

it is seen that:

\[
E(\hat{\psi}) = (\mu + \mu + \alpha + \beta + \gamma) - (\mu + \alpha + \mu + \beta) = \gamma
\]

Clearly, this is the contrast that is appropriate for determining 
the magnitude of the interaction effect. It involves all four 
cells of the design in exactly the same manner as suggested when 
the 2 by 2 ANOVA design was discussed in the earlier sections of 
this paper. Thus, it is readily apparent that even though one may 
subscribe in principle to the intuitive interaction model, in 
practice when it comes to estimating and isolating the interaction 
effects, even contrasts among the parameters of the intuitive model 
reduce to exactly the same contrasts encountered in the mathematical 
ANOVA model. 
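
The expected-value algebra for the three candidate comparisons can be tabulated mechanically. The Python sketch below is illustrative only: the dictionary layout and function name are hypothetical, and the assignment of cell labels to groups is an assumption consistent with the expected values used above (placebo: $\mu$; one drug alone: $\mu + \alpha$ or $\mu + \beta$; both drugs: $\mu + \alpha + \beta + \gamma$). Each comparison is reduced to the coefficient it places on each parameter.

```python
PARAMS = ("mu", "alpha", "beta", "gamma")

# Expected cell means under the intuitive model (coefficient on each parameter).
expected = {
    "y11": {"mu": 1},                                      # placebo
    "y12": {"mu": 1, "alpha": 1},                          # Drug A only
    "y21": {"mu": 1, "beta": 1},                           # Drug B only
    "y22": {"mu": 1, "alpha": 1, "beta": 1, "gamma": 1},   # Drugs A and B
}


def expected_value(weights):
    """Return the coefficient on each parameter in E(sum of w * cell mean)."""
    total = dict.fromkeys(PARAMS, 0.0)
    for cell, w in weights.items():
        for param, coef in expected[cell].items():
            total[param] += w * coef
    return total


pooled = expected_value({"y22": 1, "y12": -0.5, "y21": -0.5})
unpooled = expected_value({"y22": 1, "y12": -1, "y21": -1})
four_cell = expected_value({"y11": 1, "y22": 1, "y12": -1, "y21": -1})

for name, result in [("pooled", pooled), ("unpooled", unpooled),
                     ("four-cell", four_cell)]:
    print(name, result)
```

Only the four-cell comparison leaves a coefficient of one on $\gamma$ and zero on every other parameter, which is the sense in which it isolates the interaction.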

The meaning of this entire discussion on Type IV errors 
manifested by interactions in ANOVA designs should be clear for the 
behavioral scientist. Significant interactions examined as either 
planned or post hoc comparisons must be evaluated either in terms 
of the interaction parameters of the model or in terms of cell means 
that define contrasts that reduce to comparisons among the interaction 
parameters of the model. If it is seen that the expected value of 
a contrast defined in terms of cell means contains any $\alpha$, $\beta$, or $\mu$
of the design, then it is immediately known that the contrast is
not a valid interaction contrast and should therefore not be
discussed as though it were related to a significant interaction 
component. Just as tests of interactions are orthogonal to tests 
of main effects, interaction contrasts are orthogonal to main effect 
contrasts and therefore their expected values are independent of 
one another. If these principles are kept in mind and if each 
interesting interaction contrast is inspected in terms of its 
expected value, then Type IV errors in the interaction model should, 
like old soldiers, fade away. 

Table 2
Two-Factor Design with I=2 and J=2 in Terms of the
Observed Means.

                           Factor A
Factor B    $A_1$               $A_2$               Mean

$B_1$       $\bar{y}_{11}$      $\bar{y}_{21}$      $\bar{y}_{\cdot 1}$
$B_2$       $\bar{y}_{12}$      $\bar{y}_{22}$      $\bar{y}_{\cdot 2}$
Mean        $\bar{y}_{1\cdot}$  $\bar{y}_{2\cdot}$  $\bar{y}_{\cdot\cdot}$


Contrast Matrix for Partitioning the Sum of Squares
in a 2 by 2 Factorial Design into Three Orthogonal
Components Related to Main Effect for A, Main Effect
for B, and their Interaction.

                    Contrast ($\hat{\psi}$ with subscript)
Cell Mean           A       B       AB

$\bar{y}_{11}$      1       1       1
$\bar{y}_{12}$      1      -1      -1
$\bar{y}_{21}$     -1       1      -1
$\bar{y}_{22}$     -1      -1       1

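Table 3. Two-Factor Design with I=2 and J=3 in Terms of the
Observed Means

                           Factor A
Factor B    $A_1$               $A_2$               Mean

$B_1$       $\bar{y}_{11}$      $\bar{y}_{21}$      $\bar{y}_{\cdot 1}$
$B_2$       $\bar{y}_{12}$      $\bar{y}_{22}$      $\bar{y}_{\cdot 2}$
$B_3$       $\bar{y}_{13}$      $\bar{y}_{23}$      $\bar{y}_{\cdot 3}$
Mean        $\bar{y}_{1\cdot}$  $\bar{y}_{2\cdot}$  $\bar{y}_{\cdot\cdot}$
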



Table 4. Contrast Matrix for Partitioning the Sum of Squares in a
2 by 3 Factorial Design into Five Orthogonal Components
Related to Main Effect for A, Main Effect for B, and
their Interactions.

                    Contrast ($\hat{\psi}$ with subscript)
Cell Mean           A       B_1     B_2     AxB_1   AxB_2

$\bar{y}_{11}$      1       1       1       1       1
$\bar{y}_{12}$      1      -1       1      -1       1
$\bar{y}_{13}$      1       0      -2       0      -2
$\bar{y}_{21}$     -1       1       1      -1      -1
$\bar{y}_{22}$     -1      -1       1       1      -1
$\bar{y}_{23}$     -1       0      -2       0       2





Table 5. Three-Factor Design with I=2, J=2, and K=2 in Terms of
the Observed Means

                        $A_1$                                  $A_2$
Factor C    $B_1$              $B_2$              $B_1$              $B_2$

$C_1$       $\bar{y}_{111}$    $\bar{y}_{121}$    $\bar{y}_{211}$    $\bar{y}_{221}$
$C_2$       $\bar{y}_{112}$    $\bar{y}_{122}$    $\bar{y}_{212}$    $\bar{y}_{222}$




Table 6. Contrast Matrix for Partitioning the Sum of Squares
in a 2 by 2 by 2 Factorial Design into Seven Orthogonal
Components Related to Main Effects for A, B, and C
and their Interactions.

                     Contrast ($\hat{\psi}$ with subscript)
Cell Mean            A      B      C      AB     AC     BC     ABC

$\bar{y}_{111}$      1      1      1      1      1      1      1
$\bar{y}_{112}$      1      1     -1      1     -1     -1     -1
$\bar{y}_{121}$      1     -1      1     -1      1     -1     -1
$\bar{y}_{122}$      1     -1     -1     -1     -1      1      1
$\bar{y}_{211}$     -1      1      1     -1     -1      1     -1
$\bar{y}_{212}$     -1      1     -1     -1      1     -1      1
$\bar{y}_{221}$     -1     -1      1      1     -1     -1      1
$\bar{y}_{222}$     -1     -1     -1      1      1      1     -1

Table 7. Two-Factor Design of Sex by Age.

                     Factor Sex
Age          Girls              Boys

6 years      $\bar{y}_{11}$     $\bar{y}_{21}$
8 years      $\bar{y}_{12}$     $\bar{y}_{22}$
10 years     $\bar{y}_{13}$     $\bar{y}_{23}$
12 years     $\bar{y}_{14}$     $\bar{y}_{24}$




Table 8. Contrast Matrix for Table 7 Based on Tests for Linear, Quadratic,
and Cubic Trends.

                    Contrast ($\hat{\psi}$ with subscript)
Cell Mean           S     A(linear)  A(quad.)  A(cubic)  SxA(linear)  SxA(quad.)  SxA(cubic)

$\bar{y}_{11}$      1       -3          1         -1         -3            1          -1
$\bar{y}_{12}$      1       -1         -1          3         -1           -1           3
$\bar{y}_{13}$      1        1         -1         -3          1           -1          -3
$\bar{y}_{14}$      1        3          1          1          3            1           1
$\bar{y}_{21}$     -1       -3          1         -1          3           -1           1
$\bar{y}_{22}$     -1       -1         -1          3          1            1          -3
$\bar{y}_{23}$     -1        1         -1         -3         -1            1           3
$\bar{y}_{24}$     -1        3          1          1         -3           -1          -1

Table 9. Hypothetical Performance on a Cognitive Task, by Boys
and Girls at Four Age Levels

                    Factor Sex
Age            Girls     Boys     Across Sex

6              14        6        10.0
8              18        12       15.0
10             20        15       17.5
12             20        19       19.5
Across Age     18.0      13.0     15.5

Note: There are 6 Ss per cell (n=6), and the mean square
error ($MS_E$) associated with these data is 16.0.




Table 10. Analysis of Variance Table for the Data in Table 9,
including a Planned Interaction Contrast.

Source                                          df     SS       MS

Sex                                             1      300      300
Age                                             3      606      202
Sex by Age                                      3      78
  $\hat{\psi}_{S \times A(\text{linear})}$      1      72.6     72.6
  $\hat{\psi}_{S \times A(\text{remainder})}$   2      5.4      2.7
Error                                           40     640      16.0

Note: The sums of squares are based on the means in
Table 9, which are proportional to those obtained
with Table 8's coefficients.

Table 11. The Intuitive 2 by 2 Design

Footnotes

1
Readers who are still wondering about Type III errors may refer
to assorted definitions reviewed by Marascuilo and Levin (1970).

2
The authors are grateful to Professors Maryellen McSweeney, Neil
H. Timm, and M.I. Charles E. Woodson for reading an initial draft
of this paper, and recommending several helpful modifications.

3
"Appropriate" is used advisedly here, in the sense that the Scheffé
procedure is the only procedure that corresponds exactly to the
initial test of hypothesis. Whether Scheffé's procedure is
desirable (with respect to statistical power, for example) is another
issue which has been discussed elsewhere (e.g., Petrinovich and
Hardyck, 1969).

4
Although the discussion and examples throughout this paper will be
based on the assumption of equal cell n's, the same general
principles may be extended to designs with unequal cell frequencies.

5
Note that even though the two contrasts for the B factor are orthogonal
in this case, the orthogonality restrictions of a factorial design
apply only to between source (i.e., main effects and interaction)
sets of contrasts. For a more comprehensive treatment of contrasts
and the general linear model, Mendenhall's (1968) book is an
excellent source.